Journal of Mathematical Biology manuscript No. 

(will be inserted by the editor) 

Erik Volz 

SIR dynamics in structured populations with 
heterogeneous connectivity 

Received: September 4, 2005, Revised; February 2, 2008 - © Springer- Verlag 2008 
Abstract. Most epidemic models assume equal mixing among members of a pop- 
ulation. An alternative approach is to model a population as a random network 
in which individuals may have heterogeneous connectivity. This paper builds on 
previous research by describing the exact dynamical behavior of epidemics as 
they occur in random networks. A system of nonlinear differential equations is 
presented which describes the behavior of epidemics spreading through random 
networks with arbitrary degree distributions. The degree distribution is observed 
to have significant impact on both the final size and time scale of epidemics. 

1. Introduction 

Contact patterns constitute an important aspect of heterogeneity within a popu- 
lation of susceptible and infectious individuals. It has also been one of the hardest 
factors to incorporate into epidemiological models. Compartment models have 
been able to capture many aspects of population heterogeneity, such as with re- 
spect to heterogeneous susceptibility and infectiousness f |27l3l9| '). But compart- 
ment models can be inadequate with respect to population structure, especially 
when contact rates follow a steep and continuous gradient. 
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Network theory describes a population of susceptible and infectious individ- 
uals as nodes in a network f |17l26l2()ll8| 'l. This has spawned a new category of 
epidemiological models in which epidemics spread from node to node by traversing 
network connections f |22ll8l21l28l8l2H] '). Pure random networks with specified 
degree distributions have been proposed as realistic models of population struc- 
ture. This case has the advantage of being well understood mathematically. The 
limiting behavior of epidemics spreading through random networks with a given 
degree distribution has been solved exactly (' |18I21| '). 

The network approach has the advantage that the mathematics of stochastic 
branching processes f |29ll5ll) 'l can be brought to bear on the problem. This allows 
for precise descriptions of the distribution of outbreak sizes early in the course of 
the epidemic, as well as a solution for the final size of epidemics (' |18I21| '). 

A shortcoming of the network model has been that stochastic branching pro- 
cesses are inadequate to describe the explicit dynamical behavior epidemics. Thus 
the distribution of outbreak sizes are easy to solve for, yet the incidence curve, 
that is the number of infecteds at a time t, has been difficult to derive. Simulation 
has been used in this case flllp. 

Heterogeneous networks make it difficult to derive differential equations to 
describe the course of an epidemic. Nevertheless, several researchers ( f 51231241 
I7I1U| ') have been successful modeling many of the dynamical aspects of network 
epidemics, particularly in the early stage where asymptotically correct formuli for 
disease incidence are now known. We improve upon these results by presenting 
a system of nonlinear differential equations which can be used to solve for the 
complete incidence curve, as well as other quantities of interest. We treat the 
simplest possible case of the SIR dynamics with constant rate of infection and 
recovery. Section |21 describes the model. Several examples are given in section |H] 
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$ ^ a fi pw 

a = —a{r + fi)pw 

T = -{r + fi)pwT - pw r n ag' (a + /?) 
W = pw{r n ag"{a + /3) - (r + ij){2W + T)) 
Table 1. A summary of the nonlinear differential equations used to the describe 

the spread of a simple SIR type epidemic through a random network. The degree 
distribution of the network is generated by g(x). 

2. The model 

We investigate undirected random networks with specified degree distributions^ |2ti| 
Let pk be the probability of a node having a degree k. As in previous research we 
will make great use of the probability generating function (PGF) corresponding 
to the degree distribution. 

Although widely employed in the probability theory and the study of stochas- 
tic branching processes, generating functions are less familiar to those working in 
mathematical epidemiology (but see |till2lll2| 'l. The utility of PGF's for the cur- 
rent investigation cannot be understated. Consider the degree distribution among 
susceptibles at a given time t. As an epidemic progresses, more highly connected 
nodes, often called "hubs", will be preferentially culled from the population of 
susceptibles. Thus the degree distribution among susceptibles will evolve as the 
epidemic progresses. Our approach will be to keep track of the evolution of this 
distribution by careful application of parameters to the PGF. This will ultimately 
allow us to find the number of infecteds at any given time. 

^ The degree of a node in a network is the number of connections to that node. 
The degree distribution is a discrete probability density over the positive integers 
describing the probability of realizing a given degree. 
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Given a degree distribution, we define the probability generating function g{x) 

as 

g{x) = po + pix + p2x' + psx'^ + ■ ■ ■ (1) 

fn most cases tliis series will converge to an algebraic function, in which case any 
operation to be done on the PGF can be done on the simple algebraic form. The 
series form can be retrieved by Taylor expansion. The degree distribution is a 
parameter of the model, so g must be well-defined. Several examples for common 
distributions are given in section The results given below hold for any degree 
distribution. 

It will be helpful to the reader if several examples are provided to illustrate 
the utility of PGF's. Generating functions allow us to manipulate probability 
densities using simple algebraic operations. For example, if we were to draw two 
realizations of a random variable X with generating function g{x), the density of 
the sum would have generating function + P2Pk-2 + • ■ ■ = g^{x). 

The mean of the random variable can be be computed by differentiating the 
generating function, < X >= kpk = <7'(1)- 

Another example more apropos to this paper is the following: Suppose we 
select a fraction a of the stubs'^ from a network whose degree distribution has 
generating function g{x). Then what proportion of nodes will not be attached to 
any of the stubs we selected? 

- a)* = g(l - a) 

k 

Meanwhile the degree distribution of those not attached to a selected connection 
is generated by 

- a)x) 
g{l-a) 

We can do better by computing the explicit generating function for the joint degree 
distribution of selected and unselected stubs. This is accomplished by applying 
^ In network vernacular a stub is one end of a connection between two nodes. 
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a second variable to the generating function. Let x correspond to selected stubs 
and y correspond to unselected stubs. The probability of a degree k node having 
m of its stubs selected is (^)a"'(l — a)*""*. Then the generating function will be 
of the form 



where c = g{a + (i) is & normalizing constant. This example is important, as 
it underlies the methodology employed in this paper. The situation would be 
identical if infection had spontaneously spread among a fraction a of the stubs 
and we asked how many nodes remained uninfected. 

We will use an indirect approach in that we will not track the evolution of 
susceptibles and infecteds directly, but rather the number of stubs which are 
attached to susceptibles and infecteds. When an infected node transmits infection 
along one of its connections, wo say the corresponding connection is occupied. 
The variable T will be the number of stubs emanating from susceptible nodes 
which are not paired with an infected or refractory alte. The variable W will be 
the number of stubs emanating from infected nodes which have not yet become 
infected or refractory. We will treat the simple case of a constant force of infection 
and constant recovery rate. The quantities of interest in the model are as follows: 

— r ~ Force of infection. The probability per unit time of infection traversing a 
network connection. 

— H := Recovery rate. The probability per unit time that a connection to an 
infected will become refractory. 

— n := The population size. 

— z := The average degree in the network. 

— T := The number of all network connections to susceptible nodes which have 
not become refractory. 



g{x,y;a) = 2^ 2^ Pki a (1 - a) x y /c 



k m— 




= ^Pfe(aa; + (1 - a)y)''/c = g{ax + (1 - a)y)/c 



k 



6 



Erik Volz 



— W := The number of all network connections to infected nodes which have 
neither become occupied nor become refractory. 

— a := The proportion of stubs not connected to an occupied or refractory stub, 
i.e. the survivor function of susceptible stubs. 

— f3 :— The proportion of stubs among susceptible nodes which are connected 
to refractory stubs. 

— S := The number of susceptibles. 

— I := The number of infecteds, including those in a refractory state. 

It is important not to confuse stubs and connections. Two stubs are paired 
to form a connection. Stubs can be dormant, can be infected (infection has been 
transmitted by the stub to its alter), or can be refractory. In particular, it is 
possible for one stub to bo refractory while its alter is infected. However if just 
one stub in a connection is infected, we say the corresponding connection is occu- 
pied. The dynamics proposed below do not keep track of the number of occupied 
connections, but rather of the number of stubs paired with infected or refractory 
alters. This is a pragmatic approach, as a susceptible can be defined as a node for 
which all of its stubs are not connected with infected alters. 

During the course of an epidemic, a node may be connected to a refractory 
stub, an infected stub, or a dormant stub. The different types of connections 
can be factored into the generating function by using multiple variables. Let the 
variable x correspond to the number of stubs paired with dormant alters, and y 
correspond to the number of stubs paired with refractory alters. Note that at any 
given time, a susceptible will not have any stubs connected to an infected alter by 
definition. Since we are only interested in the degree distribution of susceptibles, 
we will not introduce a variable for the number of infected stubs. 

For susceptibles, stubs will be distributed among refractory connections and 
unoccupied/non-refractory connections. As defined above, a is the probability of 
having the latter type of connection, while /3 is the probability of the former. The 
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generating function for the degree distribution among susceptibles will be 

J2 Pkic'X + liyfic = g{ax + (5y) / g{a + (5) (2) 

fe 

The quantity T is easy to derive by similar logic. The probability of a node having 
degree k and contributing m stubs to T is 

So in terms of the PGF, the number of stubs emanating from susceptibles which 
do not have refractory alters will be 

r = [g{ax + ^nag'{a + l3) (3) 

a and (3 will change over the course of the epidemic, thereby controlling the 
evolution of the degree distribution It remains to determine the dynamics of 
these parameters. At any given time, the hazard rate for an unoccupied stub being 
connected to an infected stub is rpw, where pw = W/ (W + T) is the proportion 
of non-refractory /unoccupied stubs connected to infecteds. Likewise, the hazard 
rate for becoming connected to a refractory stub is fipw ■ Recall a is the survivor 
function for stubs not connected to occupied or refractory stubs; thus its dynamics 
is governed by 

d — —a{r + fi)pw (4) 

The evolution of f3 is more complicated. The probability of a stub connected 
to a susceptible node surviving to a time t is of course a. At time t, the hazard 
of connecting to a refractory stub is fipw ■ Then we have the following: 

P = a ^j, pw (5) 

The dynamics of W is dependent both on the outflow of stubs becoming occu- 
pied and refractory, plus the inflow of stubs from newly infected nodes. Note that 
the total degree mass of the network, M = nz is conserved. If we denote by X the 
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stubs which are either occupied or refractory, we have the identity W — M — T —X . 
Differentiating gives W . 

W = -f~X (6) 

X is quite simple. When a network connection becomes occupied or refractory, 
the two stubs making up the connection change state. Then X increases at twice 
the rate at which stubs from W become refractory or occupied. 

X = 2{r + ii)W 

Differentiating equation ^ and using equation ||1J gives 

f = -{r + tj)pwT -pw r n o? g"{a + P) (7) 

Finally, combining equations ©, ©, and Q we have 

W = {r + ^i){pwT ~ 2W) + pw r n g" (a + fi) (8) 
= pw{r n ag"{a + /?) - (r + mu)(2W + T)) (9) 

This completes the model. 

Once the model has been integrated the number of susceptibles can be deter- 
mined by applying the PGF to distribution parameters a and (3. At a given time 
t, the number of susceptibles S is 

S-nY^Y.pJl\a-p'- (10) 

k m=0 \ / 

= nY^pk{a + pf =ng{a + l3) (11) 

k 

The number of infecteds including those who have recovered is / = n — 5. 
3. Examples 

The model has been tested on several common degree distributions: 
— Poisson: pk ~ ^-p — . This is generated by 

5(x)=e^(^-i' (12) 
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— Power-law. For our experiments, we utilize power-laws with exponential cutoffs 
k: pk — ^. , fc > 1 where Lin(x) is the nth polylogarithrn of x. This is 
generated by 

g(x) = Li-,{xe-^^'')ILi^{e-^l'') (13) 

— Exponential: = (\ — e~^^ ^')e~^^ . This is generated by 

If a single node is chosen at random from the population and infected, we can 
anticipate the following initial conditions; The survivor function for uninfected 
stubs, a, will begin at 1 and evolve downwards. /9 will begin at and evolve 
upwards. T will be equivalent to the degree mass of the network minus the degree 
of the initial infected. And W will be the degree of the initial infected. We take the 
degree of the initial infected to be the average degree within the network. These 
are the initial conditions used in the trials shown in figure and |5| 

FigureQshows the disease incidence for each of the degree distributions 11211 , (1131 , 
and dJ, with a force of infection r = .2 and mortality /i = .1. The parameters of 
the degree distributions were chosen so that each network has an identical average 
degree of 3. That is, the density of connections in each network is the same. Nev- 
ertheless, there is widely different epidemic behavior due to the different degree 
distributions. 

A sense for the different dynamical behaviors of each of the three networks can 
be gotten from figureQ Consistent with previous research, the degree distribution 
has a great impact on the final size of the epidemic ( llbl^ll *). More importantly, 
the three networks exhibit widely varying dynamical behavior. In particular, note 
that the power law network experiences epidemics which accelerate very rapidly. 
Such epidemics enter the expansion phase virtually as soon as the first individual 
in the network is infected. Both the Poisson and exponential networks experience a 
lag before the expansion phase of the epidemic. These observations are consistent 
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Fig. 1. The number of infecteds (including recovered) is shown versus time 
for an SIR model on three networks. Force of infection and mortality are con- 
stant: r = 0.2, /i — 0.1. The networks have Poisson {z — 3), power law 
(7 = 1.615, K = 20), and exponential (A — 3.475) degree distributions. Each 
of these degree distributions has an average degree of 3. 

with the findings in (^_) that the timescale of epidemics shortens with increas- 
ing contact heterogeneity. Pure power laws have an infinite second moment, and 
therefore have a minimally short time-scale. This has important imphcations for 
intervention strategies, as it is often the case that interventions are planned and 
implemented only after a pathogen has circulated in the population for some time. 
If an epidemic were to occur in the power-law network, there would be little time 
to react before the the infection had reached a large proportion of the population. 

Several other variables of interest are computed as a byproduct of the model. 
Figure |21 shows the most important for the power law trial described above, a 
shows the proportion of stubs not connected to an occupied or refractory alter. (3 
shows the proportion of stubs among susceptibles connected to a refractory alter. 
These variables do not quite move in tandem and may cross each other. Also 
shown is W (rescaled by population size n) which is similar to the hazard rate 
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Fig. 2. a, /3, and W/n are shown versus t for a power law network with exponent 
K = 1.615 and exponential cutofT k — 20. Force of infection and mortality are 
constant: r — 0.2, jj, — 0.1. 



of becoming infected {rW/{W + T)). The epidemic ceases only when W reaches 
negligible levels. 

Something offered by this model and not to the author's knowledge seen pre- 
viously, is an explicit calculation for how the degree distribution of susceptibles 
evolves over the course of the epidemic. The infection will clearly tend to strike 
more highly connected individuals before more isolated individuals. Thus we ex- 
pect the degree distribution to become bottom-heavy, as high degree nodes are 
gradually weeded out of the population. This is indeed observed in figure |3 for 
the Poisson trial described above. 

Recall that the degree distribution of susceptibles is generated by equation ||5J 
and that we retrieve the explicit degree distribution by differentiation: 



Applying this to the Poisson PGF (equation I12II ) gives 

\za) e 



Pk = 



fc! 



(15) 



(16) 



We recognize this as simply the Poisson distribution with an adjusted parameter 
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Fig. 3. The degree distribution (equation II Kill ) for susceptibles is shown at three 
different times during the course of an epidemic on a Poisson network (2 — 3). 
Force of infection and mortality are constant: r = 0.2, /i — 0.1. 

Previous work (\'21\) has shown that there is a critical transmissibility above 
which an epidemic will reach a fraction of the population in the limit as n goes to 
infinity. Below that threshold, the epidemic is limited to a finite-sized outbreak. 
Figure |1| shows the qualitatively different dynamical behavior of outbreaks below 
the phase transition for networks with a Poisson distribution. Note that these out- 
break sizes are independent of the population size, n, in contrast to the incidence 
curves above the phase transition which are sensitive to n both in the time-scale 
of the epidemic and the number ultimately infected. 

Define the transmissibility, r, of the disease as the probability that the infec- 
tion will traverse a network connection between and infected and a susceptible^. 
With constant force of infection and mortality 

r 

T = 

r + jj, 

What is the critical transmissibility that defines the phase transition? Recall that 
the epidemic is complete when W is negligible and decreasing. If W is decreasing 
at t = then the epidemic will necessarily die out without reaching a fraction of 

^ r is related to to the traditional Ro through the degree distribution. See |18| 
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Fig. 4. The number of infecteds (including recovered) is shown versus time 
for an SIR model on a Poisson network {z — 3). Each of these trials are below 
the critical level of transmissibility required to sustain an epidemic. Mortality is 
constant, /i = 0.4, while three different levels of the force of infection are tried, 
r = 0.15,0.17,0.18. 

the population. The critical point occurs where 

Wt=o = = -f-X 



Applying equations Q and Q 
aW 



= 



[(r + + P)+ar g" {a + /3)] - 2(r + fi) 

-Q^ n g"{a + (3) 



W + T 

r + fj, / an 



r \W + T 



g'{a + l3)-2) = 



W + T 
2W + T 



r + ^ n g" {a -\- (3) 

At t = 0, a = 1, /? = 0, and T n g'{l). Then 

r* =g'{l)/g"{l) 



(17) 



This is in agreement with previous results based on bond-percolation theory (|21|'). 



4. Discussion 

The statistical properties of SIR epidemics in random networks have been un- 
derstood for some time, but the explicit dynamics have been understood mainly 
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through simulation. This paper has addressed this shortcoming by proposing a 
system of differential equations to model SIR in random networks. 

It should be noted that the SI dynamics are a special case of this model 
(/i — 0), in which case the ultimate extent of the epidemic is simply the giant 
component f|19)'l^ of the network. 

The distribution of contacts, even holding the density of contacts constant, 
has enormous impact on epidemic behavior. This goes beyond merely the extent 
of the epidemic, but as shown here, the dynamical behavior of the epidemic. In 
particular, the distribution of contacts plays a key role in determining the onset 
of the expansion phase. 

The distribution dynamics from equation (|5J and shown in figure|3|have impor- 
tant implications for vaccination strategies. Previous work f |16ll4l 'l has focused 
on determining the critical levels of vaccination required to halt or prevent an 
epidemic. It is usually taken for granted that contact patterns among suscepti- 
bles are constant. Furthermore, most widespread vaccinations occur only once an 
epidemic is underway. Future research could be enhanced by considering optimal 
vaccination levels when the epidemic proceeds unhindered for variable amounts 
of time. 

It is hoped that the distribution dynamics described in this paper will find 
applications beyond modeling heterogeneous connectivity. The dynamic PGF ap- 
proach may be used to capture other forms of heterogeneity, such as of suscepti- 
bility, mortality, and infectiousness. 



* The giant component of a network, if it exists, is the largest set of nodes such 
there exists a path between any two of them; furthermore the giant component 
must occupy a fraction of the network in the limit as network size goes to infinity. 
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